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NGINX and Caching, Why and How? 


Google Changed the Rules... 


“We want you to be able to get from one page to 
another as quickly as you turn-the page on a book.“ — 
Urz Ho. cle, KS e 


(RAS 


Why Cache? 


SERVER CRASHES 


e Performance 
e Consolidate Infrastructure 


e High Availability 


* Lower Downtime SERVER CRASHES EVERYWHERE 


NGINX Can help... 


It's the cache that powers some of the largest CDNs: 


Cache static and dynamic content 


Improve dynamic content performance 
with microcaching 

Serve "stale" content while revalidating 
in background for better performance 


Override or set Cache-Control headers >. = i 
Manage the cache easily with the we ae 
cache-purging API 
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NGINX perfect for both on-prem.. 


And the Cloud 


Netflix Cuts 


Costsand («G&G 2... 


Reaches Over 50 


The Challenge Solution Results 


m | I | | (0) n + Big 3 CDN vendors - Create own CDN using - Better experience for 
(Akamai, Limelight, & NGINX, FreeBSD, and users 


S u bsc ri b e rs l n 40 Level 3) struggling to other OSS software Reduced costs with hà 


keep Up MI demand 14 TB 1RU appliance CDN fees 


Co u ntri es * Rising costs offered to all ISPs for 


f Reduced network 
Lack of control ee traffic for partnering 
ISPs 


ra 


“We chose the NGINX web server software because of its proven 
| scalability and performance. By working directly with the core 

engineering team at NGINX, Inc., we received great development 
| support for our project." 


- Florance, Vice President of Content Delivery at Netflix 


PBS 


“The benefits we get from NGINX are speed and reliability. It's our firewall, 
cache, and proxy. It fronts our APIs and our public facing Websites. Pretty 
much every request that goes to PBS.org or its subdomains runs through 
NGINX. Really the biggest reason we use NGINX is to mitigate thundering 
herd and improve our application performance.” 


— Mike Howsden, Director of DevOps at PBS 


Q 


The Basic Principle 


1) HTTP Request: 2) NGINX hashes the cache key, and checks if it already exists. If 
GET /index.html It does not or is expired, NGINX will send the request to the upstream server 


Get /index.html 
ZA 
oT Ie 
LE = UD 
Get /index.html 


3) Origin server 
responds 


4) NGINX caches the response to disk, 
places the hash in memory 
and response is served to client 


Content Caching with NGINX is Simple 


Definition: Sets the path and other parameters of a cache. Cache data are stored in files. 
The file name in a cache is a result of applying the MD5 function to the cache key. 


http { 


proxy_cache_path /tmp/nginx/micro_cache/ keys_zone=Large_cache: 10m 
max_size=300g inactive=14d; 


Definition: Defines a key for caching. Used in the proxy_cache_path directive. 


server { 


proxy_cache_key $scheme$proxy_host$request_ur1 ; 


Definition: Defines a shared memory zone used for caching. The same zone can be used 
in several places. 


location /video { 


proxy_cache large_cache; 


location /images { 


proxy_cache small_cache; 


Definition: Sets caching time for different response codes. 


Location / { 


proxy_cache_valid 200 301 302 5m; 
} 


location ~* \.Cjpglpnglgiflico)$ { 


proxy_cache_valid any 1h; 


server { 


Location / { 
‘ees Basic Caching 
proxy_pass http://backend.com; 


} 


location 4~ /images { 


proxy_cache cache; 
proxy_cache_ valid 200 301 302 12h; 


# ps aux | grep nginx 
root 14559 0.0 ‘ 53308 3360 ? ; : master process /usr/ 
sbin/ .conf 


27880 A 3 53692 2724 ? : : worker process 
27881 a ` 53692 2724 ? H : worker process 
27882 n : 53472 2876 ? a : cache manager process 
27883 . 5 53472 2552 ? : : cache loader process 


+ Cache Manager - activated periodically to check the state of the cache. If the cache size 
exceeds the limit set by the max_size parameter to the proxy_cache_path directive, the 
cache manager removes the data that was accessed least recently, as well as the cache 
considered inactive. 


« Cache Loader - runs only once, right after NGINX starts. It loads metadata about previously 
cached data into the shared memory zone. 
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Cache Headers 


e Cache-Control - used to specify directives for caching mechanisms in both, requests and 
responses. (e.g. Cache-Control: no-cache or Cache-Control: private) 


e Expires - contains the date/time after which the response is considered stale. If there is a Cache- 
Control header with the "max-age" or "s-max-age" directive in the response, the Expires header is 
ignored. (e.g. Expires: Wed, 21 Oct 2015 07:28:00 GMT) 

e Last-Modified - contains the date and time at which the origin server believes the resource was last 
modified. HTTP dates are always expressed in GMT, never in local time. Less accurate than the ETag 
header (e.g. Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT) 


e ETag -is an identifier (or fingerprint) for a specific version of a resource. (e.g. ETag: “58efdcd0-268") 


: © 
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Override Cache-Control Headers 


Tip: By default NGINX will honor all Cache-Control headers from the origin server, in turn not caching 
responses with Cache-Control set to Private, No-Cache, No-Store or with Set-Cookie in the response 
header. 


Using proxy_ignore_headers you can disable processing of certain response header fields from the 
proxied server. 


Location A- /wordpress { 


proxy_cache cache; 
proxy_ignore_headers Cache-Control; 
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Fine Tuning the Cache 


Types of Content 


Static Content : ` Dynamic Content : ` User Content 
- Images : + Blog Posts | + Shopping Cart 
- CSS | : + Status | ' + Unique Data 


- Simple HTML | | + API Data (Maybe?) `- Account Data 


Easy to cache 3 : Micro-cacheable! 3 | Cannot Cache 


Documentation IN) 


http { 


upstream backend 
keepalive 20; Enable keepalives on upstream 
server 127.0.0.1: 8080; 


} 


proxy_cache_path /var/nginx/micro_cache levels=1:2 keys_zone=micro_cache:10m 


server { 


Listen 80; 


proxy_cache micro_cache; a = Fr 
roxy_cache_valid any 1s; hl alte ae 
p y status with a 1 second value 
location / { 
proxy_http_version 1.1; Set required HTTP version and 
= mm D H 
proxy_set_header Connection ; pass HTTP headers for keepalives 


win, 


proxy_set_header Accept-Encoding ; 
proxy_pass http://backend; 


Final Touches 
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Definition: Allows starting a background subrequest to update an expired cache item, while a stale 
cached response is returned to the client. Note that it is necessary to allow the usage of a stale cached 
response when it is being updated. 


Location / { 


proxy_cache_background_update on; 


Documentation 
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proxy_cache_lock 


Definition: When enabled, only one request at a time will be allowed to populate a new cache element 
identified according to the proxy_cache_key directive by passing a request to a proxied server. 


Other requests of the same cache element will either wait for a response to appear in the cache or the 
cache lock for this element to be released, up to the time set by the proxy_cache_lock_timeout directive. 


Related: See the following for tuning... 
- proxy_cache_lock_age, 
+ proxy_cache_lock_timeout 


Documentation 
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Definition: Determines in which cases a stale cached response can be used during communication with 
the proxied server. 


location /contact-us { 


proxy_cache_use_stale error timeout http_500 http_5@2 http_503 http_504; 


Documentation 


http { 


upstream backend { 
keepalive 20; 
server 127.0.0.1:8080; 
} 


proxy_cache_path /var/nginx/micro_cache Levels=1:2 keys_zone=micro_cache: 10m 
max_size=100m inactive=600s; 


server { 
listen 80; 
proxy_cache micro_cache; 


proxy_cache_valid any 1s; 
proxy_cache_background_update on; 


proxy_cache_lock on; s eer 
proxy_cache_use_stale updating; Final optimization 


location / { 


proxy_http_version 1.1; 
proxy_set_header Connection 
proxy_set_header Accept-Encoding : 
proxy_pass http://backend; 


um, 
, 


mu 


Further Tuning and Optimization 
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proxy_cache_revalidate 


Definition: Enables revalidation of expired cache items using conditional GET requests with the “If- 
Modified-Since” and “If-None-Match” header fields. 


Proxy Cache [NGINX] Origin Server 
If-Modified-Since: Wed, 21 Oct 2015 07:28:00 GMT — Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT 
If-None-Match: “686897696a7c876b7e" — ETag: “686897696a7c876b7e” 


Documentation 
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Definition: Sets the number of requests after which the response will be cached. This will help with disk 
utilization and hit ratio of your cache. 


location ~* /legacy { 


proxy_cache_min_uses 5; 


Documentation 
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Definition: NGINX only caches GET and HEAD request methods by default. Using this directive you can 
add additional methods. 


If you plan to add additional methods consider updating the cache key to include the $request_method 
variable if the response will be different depending on the request method. 


Location ~* /data { 


proxy_cache_methods GET HEAD POST; 


Documentation 


Tips and Tricks! 


Tip: The more relevant information in your log the better. When troubleshooting you can easily add the proxy 
cache KEY to the log_format for debugging. For a list of all variables see the “Alphabetical index of 
variables” on nginx.org. 


r 


log_format main \rid="$request_id" pck="$scheme://$proxy_host$request_uri" 
‘ucs="$upstream_cache_status" ' 
"stte="$server_name" server="$host” dest_port="$server_port 
"dest_ip="$server_addr" src="$remote_addr" src_ip="$realip_remote_addr 
"user="$remote_user" time_local="$time_local" protocol="$server_protocol" ' 
"status="$status" bytes_out="$bytes_sent" ' 
"bytes_in="$upstream_bytes_received" http_referer="$http_referer" 


" Y 


11 À 


Y 


Y 


"http_user_agent="$http_user_agent" nginx_version="$nginx_version" 
"http_x_forwarded_for="$http_x_forwarded_for" ' 
"http_x_header="$http_x_header" uri_query="$query_string" uri_path="$uri" 


J 


" Y 


"http_method="$request_method" response_time="$upstream_response_time 
"cookie="$http_cookie" request_time="$request_time" '; 
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Tip: Using the add_header directive you can add useful HTTP response headers allowing you to debug 
your NGINX deployment rather easily. 


server { 


# add HTTP response headers 
add_header CC-X-Request-ID $request_id; 


add_header X-Cache-Status $upstream_cache_status; 


Tip: Use cURL or Chrome developer tools to grab the request ID or other various headers useful for 
debugging. 


# curl -I 127.@.0.1/images/hawaii. jpg 
HTTP/1.1 200 OK 

Server: nginx/1.11.10 

Date: Wed, 19 Apr 2017 22:20:53 GMT 
Content-Type: image/jpeg 
Content-Length: 21542868 


Connection: keep-alive 

Last-Modified: Thu, 13 Apr 2017 20:55:07 GMT 
ETag: "58efe5ab-148b7d4" 

OS-X-Request-ID: 1e7ae2cf83732e8859bc3e38df912edl 
CC-X-Request-ID: d4a5f7a8d25544b1409c351a22f42960 
X-Cache-Status: HIT 

Accept-Ranges: bytes 


Use NGINX Plus for More Insight... 


NGUX+ 


Caches 


Zone State Memory usage Max size Used Disk usage Traffic Hit Ratio 


Served Written Bypassed 


http_cache © fh» 512 MB 756KB fb% 237GB 127MB 149MB 


Consider NGINX Controller for Historical Data 


Overview Configure Dashboards v Analyzer Alerts A 


IN} Controller 
Systems AWS_EC2_Instance_2 © UTC-07 + 14:48 v 
nginx-plus-r14-p1 System 
NGINX Connections/s 184.26 
06/20/18 14:00 


© AWS_EC2_Instance_2 


200.00 


© AWS_EC2_Instance_1 


100.00 


0.00 


Ô http.conn.accepted 


NGINX Current Connections 


100 


+ New Instance 


NGINX Controller Metrics 


Cache Metrics 


* nginx.cache.bypass 

e nginx.cache.expired 

* nginx.cache.hit 

* nginx.cache.miss 

* nginx.cache.revalidated 
* nginx.cache.stale 


* nginx.cache.updating 


Type: counter, integer 

Description: Various statistics about NGINX cache usage. 
Source: access.log (requires custom log format) 
Variable: $upstream cache status 
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NGINX Controller Metrics 


Cache Zone Metrics 


« plus.cache.bypass 

+ plus.cache.bypass.bytes 

« plus.cache.expired 

+ plus.cache.expired.bytes 

+ plus.cache.hit 

« plus.cache.hit.bytes 

+ plus.cache.miss 

+ plus.cache.miss.bytes 

« plus.cache.revalidated 

+ plus.cache.revalidated.bytes 
+ plus.cache.size 

+ plus.cache.stale 

+ plus.cache.stale.bytes 

« plus.cache.updating 

+ plus.cache.updating.bytes 


Type: counter, integer; counter, bytes 
Description: Various statistics about NGINX Plus cache usage. 
Source: NGINX Plus status API 
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Tip: 


If you want to disregard the cache and go strait to the origin for a response, you can use the 
proxy_cache_bypass directive. 


Location / { 


proxy_cache cache; 
proxy_cache_bypass $cookie_nocache $arg_nocache $http_nocache; 
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proxy_cache_purge 


Definition: Defines conditions under which the request will be considered a cache purge request. If at least 
one value of the string parameters is not empty and is not equal to “0” then the cache entry with a 
corresponding cache key is removed. 


The result of successful operation is indicated by returning the 204 (No Content) response. 


NGINX Plus only feature 


Documentation 
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Caching is Not Just for HTTP 


Tip: NGINX can also be used to cache other backends using their unique cache 
directives. 
(e.g. fastcgi_cache, uwsgi_cache and scgi_cache) 


Alternatively, NGINX can also be used to retrieve content directly from a memcached 
server. 


HTTP 
Co” we FastCGI 


AR Le ie oe” UWSGI 
LL. SCGI 


Memcached 


y 


y 


CB 


Architecting for High Availability 


Two Approaches 
- Sharded (High Capacity) 
- Shared (Replicated) 
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Shared Cache Clustering 


& —> M 


Highly Available 


Cache Cluster with Failover Origin Server 


Tip: If your primary goal is to achieve high availability while minimizing load on the origin servers, this scenario 
provides a highly available shared cache. HA cluster should be Active/Passive configuration. N, 
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Tip: 


and Failover 


— + à >f 


Origin 


Origin 


In the event of a failover there is no loss in cache and the origin does not suffer unneeded proxy requests. 


Q 
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Tip: 


Sharding Your Cache 


St 


® =) un 
© 


a 


_ > 


Each resource is cached 
on only one server 


Consistent-hash 
load balancing 


=-=- 


LB Tier Cache Tier Origin Server 


If your primary goal is to create a very high-capacity cache, shard (partition) your cache across multiple 
servers. This in turn maximizes the resources you have while minimizing impact on your origin servers 
depending on the amount of cache servers in your cache tier. 


Q 


Combined Load Balancer and Cache 


` 

! Traffic is internally 

| load-balanced across 
ı the cache servers 


Incoming traffic 
distributed using RR 
DNS, keepalived, etc 


Combined LB 
and Cache Tier 


Tip: Alternatively, It is possible to consolidate the load balancer and cache tier into one with the use of a 
various NGINX directives and parameters. 
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Multi-Tier with “Hot Cache” 


Incoming traffic 
distributed using RR 
DNS, keepalived, etc 


[111 
tit 
au 


A j Tit (TTI 
LV au au 
\ / 4 
mm heommmmmmmmmmbommmmm………’ 


distributed using 
consistent hash 


Qu mn 
— um 


Load Balancing jg. #§ 4§ ee 
and Hot Cache Tier Cache Tier 
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Tip: If needed, a “Hot Cache Tier” can be enabled on the load balancer layer which will give you the same high 
capacity cache and provide a high availability of specific cached resources. N, 


EBOOK 


High-Performance 
Caching with NGINX 
and NGINX Plus 


ORMANCE Implement high-performance caching and cache 


pich- 
TOWING x PUS clustering for your web applications with NGINX 
with 


NX anon 


and NGINX Plus 


DOWNLOAD FOR FREE 


NGiWX — 


Thank you for coming! 


~— 


Kevin Jones 
kevin.jones@nginx.com 


nginx.com | @nginxinc 


Tip: Using the hash load balancing algorithm, we can specify the proxy cache key. This allows each resource to 
be cached on only one backend server. 


upstream cache_servers { 
hash $scheme$proxy_host$request_uri consistent; 


server prod.cachel.host; 
server prod.cache2.host; 
server prod.cache3.host; 
server prod.cache4.host; 


Example Cache Purge Configuration 


Tip: Using NGINX Plus, you can issue unique request methods to invalidate the cache 


proxy_cache_path /tmp/cache keys_zone=mycache:10m levels=1:2 inactive=60s; 


map $request_method $purge_method { 
PURGE 1; 
default Q; 

} 


server { 
listen 80; 
server_name www.example.com; 


Location / { 
proxy_pass http://localhost: 8002; 


proxy_cache mycache; 


proxy_cache_purge $purge_method; 


Primary Cache Server 


proxy_cache_path /tmp/mycache keys_zone=mycache: 10m; 
server { 
Listen 80; 


proxy_cache mycache; 
proxy_cache_valid 200 15s; 


Location / { 
proxy_pass http://secondary; 


# secondary 
# origin 


proxy_cache_path /tmp/mycache keys_zone=mycache: 10m; 


server { 
Listen 80; 


proxy_cache mycache; 
proxy_cache_valid 200 15s; 


Location / { 
proxy_pass http://origin; 
} 
} 


upstream origin { 
server 192.168.56.12; 
} 


Headlines go here. 
Two line maximum length. 
e This is where bullet points about NGINX should go. 


° You can easily change this to a non-bulleted list by clicking on the “Bullets” 
icon in the "Home" ribbon. 


- Or you can apply a new, non-bulleted layout to this slide 


e One way to then make all the content on this page the same size is to use the "Indent Less” button 
on the subordinate bullet points. 


e You can start a new series of bullets by clicking on the 
"Indent Less” button in the "Home" ribbon. 
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Headlines go here. 
Two line maximum length. 


e This is where bullet e This is where bullet 
points about NGINX go. points about NGINX go. 

° You can easily change this to a ° You can easily change this to a 
non-bulleted list by clicking on non-bulleted list by clicking on 
the "Bullets" icon in the "Home" the “Bullets” icon in the "Home" 
ribbon. ribbon. 

- Third level bullet - Third level bullet 
e Fourth level bullet e Fourth level bullet 


Headline goes here. 


e This is where bullet points 
about NGINX go. 


° You can easily change this to a 
non-bulleted list by clicking on 
the “Bullets” icon in the "Home" 
ribbon. 

- Third level bullet 


e Fourth level bullet 
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Headline goes here. 


Code code code code code 
Code code code code code 
Code code code code code 


Code code code code code 


Code code code code code 
Code code code code code 
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Source | Notes 


1. Use this page for source information 
2. And/or notes 


3. This text box is set-up to be two columns, and the 
text will wrap after each entry as shown here. 


4. Four 
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